Many of the classes of computer graphics algorithms and polygon
storage schemes can be adapted for parallel execution on various parallel
architectures. The Connection Machine is one such architecture; it
should be thought of as a multiprocessor grid that can be reconfigured into
standard 2-dimensional mesh and n-dimensional hypercube architectures.
The classes of algorithms considered in this paper are SPLINES; POLYGON
STORAGE; TRIANGULARIZATION; and SYMBOLIC INPUT.


2.  ARCHITECTURE  OVERVIEW 


The target Connection Machine (hereafter designated as CM) for the
algorithms of this paper has 8192 physical processors. Each physical
processor has 8 kilobytes of local memory plus an arithmetic-logic unit.
Every processor can communicate with any other processor through a
router. Thus this CM has a shared memory of 64 megabytes when used as a
standard multiprocessor (MIMD) architecture. In addition, the CM
interconnection structure can simulate a 2-dimensional mesh and an
n-dimensional hypercube (SIMD) architecture, with the mesh being the
default architecture. The front end for the CM is a Symbolics and the high
level language is LISP or FORTRAN.


What are the advantages of the CM for computer graphics algorithms?
The primary advantage is the reconfigurability of the CM. This
reconfigurability is under program control and can be initiated by LISP
program statements. Within a LISP program, the interprocessor communication
modes can be mixed within a program block, allowing SIMD and MIMD execution
within the same algorithm. Thus the best algorithm/architecture
combination can be readily chosen for a wide variety of computer
graphics problems. For FORTRAN the mixing of interprocessor communication
modes takes place at a lower level than in LISP and is mostly determined
by the layout of array storage.


3.  SPLINE ALGORITHMS


The following algorithms relate to the class of problems dealing with
curve and surface interpolation. The interpolation functions discussed are
the B-SPLINE functions. These functions are calculated using recurrence
relations. The recurrence relations, in turn, use two parameters for curve
interpolation and three parameters for surface interpolation. The
recurrence relations produce what are called blending functions. These
blending functions can be thought of as weight functions that indicate the
percentage contribution to the final interpolation value of the designated
points that are being interpolated. The designated points are called
control points. The inverse of the interpolation problem is the production
of the control points from function values. The function values for the
inverse problem can be input data values or data values derived from an
algebraic equation. The B-SPLINE functions have several properties.
The most important property for parallel evaluation of the function values
is the property of local control. Essentially, each control point affects
the values of the interpolating B-SPLINE only within a narrow range of
blending function parameter values.

3.1  B-SPLINE  CONVENTIONS 

NOMENCLATURE: 

o  =  order  of  the  B-SPLINE  curve  or  surface. 

s  =  number  of  parameter  steps  between  integer  values  of  u  and  v. 

S  =  increment  step  value  of  parameter  =  1/s. 

n  =  number  of  control  points  to  be  interpolated  by  a  3-d  curve  using 
the  blending  function  parameter  u. 

nu  =  number  of  control  points  to  be  interpolated  by  a  3-d  surface 
using  the  blending  function  parameter  u. 

nv  =  number  of  control  points  to  be  interpolated  by  a  3-d  surface 
using  the  blending  function  parameter  v. 

$N_{i,o}(u)$ = B-SPLINE blending function for the 3-d curve or surface
using the blending function parameter u. (order o)

$M_{j,o}(v)$ = B-SPLINE blending function for the 3-d surface using the
blending function parameter v. (order o)

$t_i$ = knot values relating the blending function parameters u and v to
the control points of the curve or surface.

B (u) = B-SPLINE 3-d curve.

B (u,v) = B-SPLINE 3-d surface.

$P_i$ = control point i for the 3-d curve B (u).

$P_{i,j}$ = control point (i,j) for the 3-d surface B (u,v).

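RANGES OF THE PARAMETERS:

The parameter u (3-d curve): $0 \le u \le n-o+2$

The parameter u (3-d surface): $0 \le u \le nu-o+2$

The parameter v (3-d surface): $0 \le v \le nv-o+2$

Knots $t_i$ ($0 \le i \le n+o$, $0 \le i \le nu+o$, $0 \le i \le nv+o$):

$$
t_i =
\begin{cases}
0 & \text{if } i < o \\
i-o+1 & \text{if } o \le i \le n \ \text{or}\ o \le i \le nu \ \text{or}\ o \le i \le nv \\
n-o+2 & \text{if } i > n \\
nu-o+2 & \text{if } i > nu \\
nv-o+2 & \text{if } i > nv
\end{cases}
$$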
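The knot rules above can be sketched in a few lines of Python. This is only an illustration for the curve case; the function name `knot_vector` is hypothetical, and it assumes the knots are indexed from 0 to n+o as stated in the ranges above.

```python
def knot_vector(n, o):
    """Return the knots t_0 .. t_{n+o} for order o, following the three rules above."""
    knots = []
    for i in range(n + o + 1):
        if i < o:
            knots.append(0)            # leading knots are all 0
        elif i <= n:
            knots.append(i - o + 1)    # interior knots increase by 1
        else:
            knots.append(n - o + 2)    # trailing knots repeat the maximum
    return knots


print(knot_vector(6, 3))  # [0, 0, 0, 1, 2, 3, 4, 5, 5, 5]
```

The repeated knots at both ends are what make the curve start and end at the first and last control points.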

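B-SPLINE RECURRENCE RELATIONS:

(I)
$$
N_{i,1}(u) =
\begin{cases}
1.0 & \text{if } t_i \le u < t_{i+1} \\
0.0 & \text{otherwise}
\end{cases}
$$

(II)
$$
M_{j,1}(v) =
\begin{cases}
1.0 & \text{if } t_j \le v < t_{j+1} \\
0.0 & \text{otherwise}
\end{cases}
$$

(III)
$$
N_{i,k}(u) = \frac{(u - t_i)\, N_{i,k-1}(u)}{t_{i+k-1} - t_i}
           + \frac{(t_{i+k} - u)\, N_{i+1,k-1}(u)}{t_{i+k} - t_{i+1}}
$$

(IV)
$$
M_{j,k}(v) = \frac{(v - t_j)\, M_{j,k-1}(v)}{t_{j+k-1} - t_j}
           + \frac{(t_{j+k} - v)\, M_{j+1,k-1}(v)}{t_{j+k} - t_{j+1}}
$$

(V)
$$
B(u) = \sum_{i=0}^{n} P_i \, N_{i,o}(u)
$$

(VI)
$$
B(u,v) = \sum_{i=0}^{n} \sum_{j=0}^{m} P_{i,j} \, N_{i,o}(u) \, M_{j,o}(v)
$$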
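Relations (I), (III) and (V) translate directly into a recursive evaluation. The Python below is a minimal sketch rather than the paper's implementation: the names `blend` and `curve_point` are hypothetical, control points are assumed to be indexed 0 to n, and a term whose denominator is zero is taken to contribute zero, which is the convention the sequential algorithm in section 3.2 applies.

```python
def blend(i, k, u, t):
    """N_{i,k}(u) from relations (I) and (III); zero-denominator terms are dropped."""
    if k == 1:
        return 1.0 if t[i] <= u < t[i + 1] else 0.0
    total = 0.0
    d1 = t[i + k - 1] - t[i]
    if d1 != 0:
        total += (u - t[i]) * blend(i, k - 1, u, t) / d1
    d2 = t[i + k] - t[i + 1]
    if d2 != 0:
        total += (t[i + k] - u) * blend(i + 1, k - 1, u, t) / d2
    return total


def curve_point(u, points, o, t):
    """B(u) from relation (V): weighted sum of all control points."""
    dim = len(points[0])
    return tuple(
        sum(p[c] * blend(i, o, u, t) for i, p in enumerate(points))
        for c in range(dim)
    )


# Seven control points (n = 6), order 3, knots from the rules in section 3.1.
t = [0, 0, 0, 1, 2, 3, 4, 5, 5, 5]
pts = [(float(i), float(i * i), 0.0) for i in range(7)]
print(curve_point(0.0, pts, 3, t))   # the curve starts at the first control point
```

Summing `blend(i, 3, u, t)` over all i gives 1 for any u inside the parameter range, which is the weight-function interpretation described at the start of section 3.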

3.2  SEQUENTIAL SPLINE ALGORITHMS

Here is a sequential algorithm for calculating a B-SPLINE curve, where

$$
A = \frac{(u - t_l)\, N_{l,k-1}(u)}{t_{l+k-1} - t_l}
\qquad
B = \frac{(t_{l+k} - u)\, N_{l+1,k-1}(u)}{t_{l+k} - t_{l+1}}
$$

1.  FOR 0 <= u <= n-o+1 step 1/s do
2.      Find i such that t_i <= u < t_{i+1}
3.      N_{i,1}(u) = 1.0
4.      FOR 2 <= k <= o do
5.          FOR 1 <= j <= k do
6.              l = i+1-j
7.              IF (t_{l+k-1} - t_l) and (t_{l+k} - t_{l+1}) = 0
8.                  N_{l,k}(u) = 0
9.              ELSE IF (t_{l+k-1} - t_l) and (t_{l+k} - t_{l+1}) <> 0
10.                 N_{l,k}(u) = A + B
11.             ELSE IF (t_{l+k-1} - t_l) = 0
12.                 N_{l,k}(u) = B
13.             ELSE IF (t_{l+k} - t_{l+1}) = 0
14.                 N_{l,k}(u) = A
15.             END IF
16.         END FOR
17.     END FOR
18.     FOR i+1-o <= sum <= i do
19.         B(u) = B(u) + P_sum * N_{sum,o}(u)
20.     END FOR
21. END FOR

The loop in step 1 is executed C = (n-o+1)*s + 1 times, where s is the
number of steps between integer increments in the parameter u. C is the
number of curve points calculated by the algorithm. Since there are 3
rules for knot values, n control points and 2 knot values to evaluate, step
2 is O(n). Calculating the blending function in steps 4 and 5 is O((o-1)*o).
The final summation to produce the curve value is O(o). It should be noted
that the value of o is for all practical purposes 3 or 4 in most computer
graphics applications. The low value of o is due to considerations of the
stability of interpolation algorithms in general. Since n >> o and n >> s
in general, the complexity of the sequential algorithm is O(n). The
value of s depends upon the number of points or polygons to be plotted.
Thus, if the value of s is greater than n, then the complexity of the
algorithm is O(n^2) or greater.
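The loop count and the linear knot search can be checked with a short Python sketch. The function names are hypothetical, and the code assumes knots indexed from 0 as in section 3.1; it illustrates only steps 1 and 2, not the full algorithm.

```python
def parameter_values(n, o, s):
    """Values of u visited by the loop in step 1: 0, 1/s, 2/s, ..., n-o+1."""
    count = (n - o + 1) * s + 1          # C in the text
    return [r / s for r in range(count)]


def find_interval(u, t):
    """Step 2: linear search for i with t[i] <= u < t[i+1]; cost grows with n."""
    for i in range(len(t) - 1):
        if t[i] <= u < t[i + 1]:
            return i
    return None                           # u lies outside every half-open interval


t = [0, 0, 0, 1, 2, 3, 4, 5, 6, 6, 6]    # knots for n = 7, o = 3
us = parameter_values(7, 3, 4)
print(len(us), us[-1], find_interval(1.5, t))
```

Indexing the loop by an integer and dividing by s avoids the drift that repeated addition of 1/s would accumulate.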


3.3  PARALLEL SPLINE ALGORITHM DISCUSSION

The most important property of splines that facilitates parallel
algorithm development is the local control property. Only a finite subset
of control points affects the value of the B-SPLINE curve or surface. This
property is reflected in steps 2 and 3 of the sequential algorithm. The
effect of this property of local control is to reduce the number of
processors needed by the parallel algorithm. Examination of steps 2 and 3
in the sequential algorithm indicates that most of the blending functions
for a particular parameter u will be zero. The number of nonzero blending
functions is equal to the order of the B-SPLINE curve. Blending functions
$N_{i,o}(u)$, $N_{i-1,o}(u)$, ..., $N_{i-o+1,o}(u)$ are nonzero, where i is
the value found in step 2 of the sequential algorithm. In addition, the
combination of this property and the structure of the recurrence relation
for the blending functions allows for the development of a set of
recurrence relations relating the final nonzero set of blending functions
of order o to the blending function $N_{i,1}(u)$ calculated in step 3 of
the sequential algorithm.

This set of recurrence relations allows for the parallel calculation of all
of the blending functions for all parameter values in 1 operation. Thus, the
loop in step 1 is eliminated. The loop that calculates the final sum, step
18 in the sequential algorithm, can be executed in parallel in O(o) steps.
Step 2 of the sequential algorithm can be executed on the CM using 1
parallel FORTRAN or LISP statement for all values of the parameter u. In
order to reduce the number of processors required, o sets of variables are
needed. The use of these extra variable sets reduces the processor
requirement from O(o*s*(n+o)) to O(s*(n+o)) in the worst case. In other
words, the number of processors required is approximately equal to the
number of points to be plotted by the algorithm's computer graphics
application. The parameters u and v have no intuitive graphics function and
only serve to generate points.

The following discussion relates to the development of the parallel
B-SPLINE recurrence relation set. This set is developed for the case of
B-SPLINE functions of order 3, which is a suitable spline order for most
computer graphics applications. The B-SPLINE recurrence is a 2nd order
linear recurrence. Consider the above recurrence equations (III) and (IV).
These recurrence equations are of the form:

$$N_{l,k}(u) = x\,N_{l,k-1}(u) + y\,N_{l+1,k-1}(u)$$

Now substitute for $N_{l+1,k-1}(u)$, using

$$N_{l+1,k-1}(u) = x_1\,N_{l+1,k-2}(u) + y_1\,N_{l+2,k-2}(u)$$

and finally

$$N_{l,k}(u) = x_2\,N_{l,k-1}(u) + y_2\,N_{l+1,k-2}(u)$$

This process could be continued until k-2 < 1, at which point the
blending function subscript would be meaningless. For the B-SPLINE of
order 3 only 1 iteration of this process is necessary. Due to the local
control property the blending function of order k-1 is really a function of
order k-2 only, since it is itself a combination of blending functions of
order k-1 and k-2 with the order k-1 function equal to zero. Also due to
the local control property only blending functions $N_{i,o}(u)$,
$N_{i-1,o}(u)$, ..., $N_{i-o+1,o}(u)$ need to be calculated. The result is
a set of recurrence relations of the form:


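(VII)
$$N_{i,o}(u) = z_1\,N_{i,o-2}(u)$$

(VIII)
$$N_{i-1,o}(u) = z_2\,N_{i,o-2}(u) + z_3\,N_{i,o-2}(u)$$

(IX)
$$N_{i-2,o}(u) = z_4\,N_{i,o-2}(u)$$

where

$$
z_1 = \frac{(u - t_i)}{(t_{i+o-1} - t_i)} \cdot \frac{(u - t_i)}{(t_{i+o-2} - t_i)}
$$

$$
z_2 = \frac{(t_{i-2+o} - u)}{(t_{i-2+o} - t_i)} \cdot \frac{(u - t_{i-1})}{(t_{i+o-2} - t_{i-1})}
$$

$$
z_3 = \frac{(u - t_i)}{(t_{i+1} - t_i)} \cdot \frac{(t_{i-1+o} - u)}{(t_{i-1+o} - t_i)}
$$

$$
z_4 = \frac{(t_{i-2+o} - u)}{(t_{i-2+o} - t_{i-1})} \cdot \frac{(t_{i-2+o} - u)}{(t_{i-2+o} - t_i)}
$$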

From steps 2 and 3 of the sequential algorithm, the blending function
$N_{i,o-2}(u)$ is $N_{i,1}(u)$ and is equal to 1. The above set of
recurrences for the parallel B-SPLINE algorithm therefore does not involve
any other blending functions of lower order, and so all of the values of
$N_{i,o}(u)$ can be calculated simultaneously for all values of u. Thus,
the value of the B-SPLINE curve for a particular value of u can be
calculated using 3 processors (for B-SPLINES of order 4 it would take 4
processors).
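For order 3 the relations (VII) to (IX) reduce to closed-form products of knot differences, because $N_{i,1}(u) = 1$ on the interval found in step 2. The Python sketch below evaluates them; the function name `order3_weights` is hypothetical and the knots are assumed to be indexed from 0 as in section 3.1.

```python
def order3_weights(u, i, t):
    """Nonzero order-3 blending values for t[i] <= u < t[i+1].

    Returns (N_{i,3}, N_{i-1,3}, N_{i-2,3}), i.e. relations (VII), (VIII), (IX)
    with o = 3, so t_{i+o-1} = t[i+2] and t_{i+o-2} = t[i+1].
    """
    z1 = (u - t[i]) / (t[i + 2] - t[i]) * (u - t[i]) / (t[i + 1] - t[i])
    z2 = (t[i + 1] - u) / (t[i + 1] - t[i]) * (u - t[i - 1]) / (t[i + 1] - t[i - 1])
    z3 = (u - t[i]) / (t[i + 1] - t[i]) * (t[i + 2] - u) / (t[i + 2] - t[i])
    z4 = (t[i + 1] - u) / (t[i + 1] - t[i - 1]) * (t[i + 1] - u) / (t[i + 1] - t[i])
    return z1, z2 + z3, z4


t = [0, 0, 0, 1, 2, 3, 4, 5, 5, 5]     # n = 6, o = 3
print(order3_weights(2.5, 4, t))        # (0.125, 0.75, 0.125)
```

Every denominator contains t[i+1] - t[i] or spans that interval, so none of them vanishes once i has been chosen with t[i] < t[i+1]. Each parameter value needs only these three weights, which is why three processors suffice per curve point.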

3.4  PARALLEL  SPLINE  CM  ALGORITHM 

First  consider  processor  interconnection  and  geometry  for  the  parallel 
spline  3-d  curve  algorithm.  The  geometry  is  a  rectangular  grid  of 
dimension  n+o  columns  by  s  rows.  The  constant  s  is  generally  smaller  than 
n  but  can  be  arbitrarily  large,  although  limited  by  the  resolution  of  the 
display,  which  is  in  reality,  the  closest  distance  between  displayable 
points. 


Each processor in a column of the rectangular grid contains the
blending function variables $N_{col(i),o}(u)$, $N^1_{col(i),o}(u)$ and
$N^2_{col(i),o}(u)$. Three different variables are used to reduce processor
usage for the following reason. The index i of the nonzero blending
function set is calculated from the integer valued knot functions. Due to
the use of s rows, each column of the grid is calculating the blending
functions for only o integer values of u in a particular row. Therefore
only o different blending variables need to be stored per processor.
Consider the case of n = 7 control points and blending functions of order
3. For 0 <= u < 1, i = 3 and only the first 3 columns participate in the
calculation. For 1 <= u < 2, i = 4 and only columns 2, 3 and 4 participate
in the calculation. For 2 <= u < 3, i = 5 and only columns 3, 4 and 5
participate in the calculation. For 3 <= u < 4, i = 6 and only columns 4,
5 and 6 participate in the calculation. For 4 <= u none of the first 3
columns participate in the calculation. Thus each processor in a
particular row and column calculates the blending function set for 3
distinct values of u.

Each processor contains a variable set $u_{j,k}$, $u^1_{j,k}$, $u^2_{j,k}$.
For each processor at grid position (j,k):

u_{j,k}  = 0 + (j-1)*S   if k <= o      u_{j,k}  = k-o   + (j-1)*S   if k > o

u1_{j,k} = 1 + (j-1)*S   if k <= o      u1_{j,k} = k-o+1 + (j-1)*S   if k > o

u2_{j,k} = 2 + (j-1)*S   if k <= o      u2_{j,k} = k-o+2 + (j-1)*S   if k > o

plus an initialization condition to match the values to the pattern of
Figure 1.

For o = 4 there would be 4 variables in the variable set for u. The reason
for the above distribution of the $u_{j,k}$'s among the processors is
similar to the explanation for the $N_{col(i),o}$'s.

Each of the processors in the rectangular grid contains a variable
$t_{j,k}$, where $t_{j,k}$ assumes the following values:

t_{j,k} = 0               if k < o+1
t_{j,k} = k - (o+1) + 1   if o+1 <= k <= n
t_{j,k} = n - (o+1) + 2   if n < k
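The per-processor values can be tabulated with a short Python sketch. The names `u_sets` and `t_value` are hypothetical; rows j and columns k are 1-based as in the grid description, and the extra initialization condition for columns 1 and 2 is deliberately left out.

```python
def u_sets(j, k, o, s):
    """(u, u1, u2) for the processor at row j, column k, before any fix-up."""
    step = 1.0 / s                       # S in the text
    base = 0 if k <= o else k - o
    offset = (j - 1) * step
    return (base + offset, base + 1 + offset, base + 2 + offset)


def t_value(k, n, o):
    """Knot value held in column k of the grid (the same in every row)."""
    if k < o + 1:
        return 0
    if k <= n:
        return k - (o + 1) + 1
    return n - (o + 1) + 2


n, o, s = 7, 3, 4
print([t_value(k, n, o) for k in range(1, n + o + 1)])
print([u_sets(1, k, o, s)[0] for k in range(1, n + 1)])
```

For columns beyond o the three sets are shifted copies of each other: the u value in column k equals the u1 value in column k-1 and the u2 value in column k-2.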

Here is the 3-d parallel spline curve algorithm for the CM:

0)  FORALL P_{j,k} , 1 <= j <= s , 1 <= k <= n DO
        INPUT (P_{j,k})
    END FORALL

1)  FORALL P_{j,k} , 1 <= j <= s , 1 <= k <= n DO
        IF k <= o
            u_{j,k}  = 0 + (j-1)*S
            u1_{j,k} = 1 + (j-1)*S
            u2_{j,k} = 2 + (j-1)*S
        ELSE
            u_{j,k}  = k-o     + (j-1)*S
            u1_{j,k} = k-o + 1 + (j-1)*S
            u2_{j,k} = k-o + 2 + (j-1)*S
        IF k = 2
            u1_{j,k} = 0
            u2_{j,k} = 1
        IF k = 1
            u1_{j,k} = 0
            u2_{j,k} = 0
    END FORALL

2)  FORALL P_{j,k} , 1 <= j <= s , 1 <= k <= n+o DO
        IF k < o+1
            t_{j,k} = 0
        IF o+1 <= k <= n
            t_{j,k} = k - (o+1) + 1
        IF n < k
            t_{j,k} = n - (o+1) + 2
    END FORALL

3)  FORALL P_{j,k} , 1 <= j <= s , 1 <= k <= n DO
        IF k < 3 THEN i = 3
        ELSE i = k

        t1 <= t_{j,i}
        t2 <= t_{j,i+1}
        t3 <= t_{j,i+o-1}
        t4 <= t_{j,i+(o-1)-1}
        t5 <= t_{j,i-2+o}
        t6 <= t_{j,i-1}

        IF (t3 - t1) OR (t4 - t1) = 0
            N_{j,k}(u_{j,k}) = 0
        ELSE
            N_{j,k}(u_{j,k}) = [(u_{j,k} - t1) / (t3 - t1)] * [(u_{j,k} - t1) / (t4 - t1)]

        IF (t4 - t1) OR ((t4 - t6) AND (t2 - t1)) = 0
            N1_{j,k}(u1_{j,k}) = 0
        ELSE IF ((t4 - t1) AND (t4 - t6) AND (t2 - t1)) <> 0
            N1_{j,k}(u1_{j,k}) = [(t4 - u1_{j,k}) / (t4 - t1)] * [(u1_{j,k} - t6) / (t4 - t6)]
                               + [(t3 - u1_{j,k}) / (t3 - t1)] * [(u1_{j,k} - t1) / (t2 - t1)]
        ELSE IF ((t3 - t1) OR (t2 - t1)) = 0
            N1_{j,k}(u1_{j,k}) = [(u1_{j,k} - t6) / (t4 - t6)] * [(t4 - u1_{j,k}) / (t4 - t1)]
        ELSE IF ((t4 - t1) OR (t4 - t6)) = 0
            N1_{j,k}(u1_{j,k}) = [(t3 - u1_{j,k}) / (t3 - t1)] * [(u1_{j,k} - t1) / (t2 - t1)]

        IF ((t5 - t6) OR (t5 - t1)) = 0
            N2_{j,k}(u2_{j,k}) = 0
        ELSE
            N2_{j,k}(u2_{j,k}) = [(t5 - u2_{j,k}) / (t5 - t6)] * [(t5 - u2_{j,k}) / (t5 - t1)]
    END FORALL

4)  FORALL P_{j,k} , 1 <= j <= s , 3 <= k <= n DO
        c_{j,k}  =  p_{j,k}
        c1_{j,k} <= p_{j,k-1}
        c2_{j,k} <= p_{j,k-2}

        W_{j,k}  = c_{j,k}  * N_{j,k}(u_{j,k})
        W1_{j,k} = c1_{j,k} * N1_{j,k}(u1_{j,k})
        W2_{j,k} = c2_{j,k} * N2_{j,k}(u2_{j,k})
    END FORALL

5)  FORALL P_{j,k} , 1 <= j <= s , 3 <= k <= n DO
        W1_{j,k} <= W1_{j,k-1}
        W2_{j,k} <= W2_{j,k-2}
        B_{j,k} = W_{j,k} + W1_{j,k} + W2_{j,k}
    END FORALL

NOTES:

1) The <= symbol indicates communication over the NEWS interconnection
network of the CM. The = symbol is the standard assignment operator. The
P_{j,k} symbol denotes the processor at position (j,k) in the rectangular
grid. Also the points P_{j,k} are equal for each row index (i.e.
P_{2,4} = P_{3,4}, etc.)

2) The FORALL statement is in pseudocode in the algorithm body. The CM
FORALL syntax for line 0 is:

0) FORALL (j = 1:s, k = 1:n).

The FORALL statement is not yet supported for the CM FORTRAN version
5.1-0.5 (June 1989). Thus the WHERE statement must be used with a mask
instead of the subscripts. This CM mask is an array mask so an extra
index array is needed. The syntax for line 0 is:

0) WHERE ( 1 <= J <= S .AND. 1 <= K <= n ).

Note that J and K are mask arrays initialized with the rectangular grid
indices. Figure 3 gives an illustration of the grid. The J and K mask
arrays can be set up using the FORTRAN DATA declaration statement.

3) All of the pseudocode variables in the above algorithm are really CM
arrays when programmed in FORTRAN. The actual procedure is to create a
2-D virtual processor set. For the small examples in the paper each virtual
processor is a physical processor. The variables are declared as arrays.

4) Only an n*s subgrid is computing. The extra 3 columns are used to hold
the knot values for i=n.
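The role of the J and K index arrays in note 2 can be imitated in ordinary Python. This is a sequential emulation for illustration only, not CM FORTRAN; the helper names `make_index_arrays` and `where_assign` are hypothetical.

```python
def make_index_arrays(rows, cols):
    """Build J and K so that J[r][c] and K[r][c] hold the 1-based grid indices."""
    J = [[j for _ in range(cols)] for j in range(1, rows + 1)]
    K = [[k for k in range(1, cols + 1)] for _ in range(rows)]
    return J, K


def where_assign(target, J, K, s, n, value_fn):
    """Emulate WHERE (1 <= J <= s .AND. 1 <= K <= n): update only masked cells."""
    for r, row in enumerate(target):
        for c in range(len(row)):
            j, k = J[r][c], K[r][c]
            if 1 <= j <= s and 1 <= k <= n:
                row[c] = value_fn(j, k)
    return target


# A grid of s = 2 rows and n + o = 5 columns, of which only n = 3 columns compute.
J, K = make_index_arrays(2, 5)
grid = [[0] * 5 for _ in range(2)]
print(where_assign(grid, J, K, 2, 3, lambda j, k: 10 * j + k))
```

On the CM every cell would be tested and updated at once; the nested loops here only stand in for that elementwise operation.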


3.5  OPERATION OF THE CM ALGORITHM

Consider the case of n = 7 and s = 4. The symbol $S_{l,o}$ indicates the
blending functions developed in section 3.3. Consider the following
figures. Figure 1 shows the arrangement of the values of the u parameter
set variables for the first row of the processor grid. Notice that the
values in the following set variables are equal: $u_{j,k}$, $u^1_{j,k-1}$
and $u^2_{j,k-2}$. Consider Figure 2, which shows the blending function
calculations for the first row of the processor grid. Row 1 is illustrated
since it serves to illustrate the principles behind the algorithm
structure. These figures illustrate the rationale for the program
statements in step 3 of the parallel algorithm. Figure 3 shows an overview
of the processor array from a theoretical standpoint. Consider the
calculation of $B_{2,4}$. Now the value of u is 1.5. The blending function
set that is nonzero for this parameter value consists of $S_{l,o}$,
$S_{l-1,o}$ and $S_{l-2,o}$ where l = 4 and o = 3. Now, 3 processors are
being used to calculate $B_{2,4}$. Figure 1, with 2 substituted for the
row index in the processor and variable subscripts, and with 1/2 added to
all of the values, would illustrate the parameter distribution for row 2
calculations. Remember that $P_{2,4}$ is calculating portions of the
functions required for $B_{2,4}$, $B_{2,5}$ and $B_{2,6}$. The $B_{2,4}$
calculation involves the variable $N_{2,4}(u_{2,4})$ in $P_{2,4}$, which
corresponds to the recurrence relation (VII). It also involves the
variable $N^1_{2,3}(u^1_{2,3})$ in $P_{2,3}$, which corresponds to the
recurrence relation (VIII), and the variable $N^2_{2,2}(u^2_{2,2})$ in
$P_{2,2}$, which corresponds to recurrence relation (IX). Figure 2 helps to
illustrate the mapping between the recurrence relations (VII), (VIII),
(IX) and the N's (the recurrence relation equivalent is denoted by an S).
Note that this scheme does not calculate the last u parameter value. This
value can be calculated and stored in any one of the processors afterwards
by executing the above parallel algorithm with minor modifications. This
last value may be needed for continuity of spline surfaces 'stitched'
together. It should also be noted that the NEWS interconnection facility
of the CM can not be used for the parallel curve algorithm, but that the
arrays can be declared with weights using the LAYOUT directive to optimize
communications between a processor and its neighbors for a distance of 3
processors, which is the maximum needed to fetch the knot values.

3.6  DISCUSSION OF THE SURFACE ALGORITHM

The strategy for the parallel calculation of a spline surface involves
the transformation of the variables in the parallel curve algorithm into
vectors. Thus, all of the required blending functions for all u and v
parameter values can still be calculated simultaneously. A subset of the
control point net equal to nu x nv will contribute to the final value of the
spline surface point for a particular pair of u and v parameter values.
As in the parallel spline curve algorithm, steps 4 and 5 need to
use the general CM router for communication between arbitrary
processors in the grid. The control points that contribute will live in the
$P_{j,k}$, $P_{j,k-1}$ and $P_{j,k-2}$ vectors. Their positions in the
vectors will be at positions f, f-1 and f-2. The i value of the vector is
the index of the nonzero blending functions for the parameter nu. The f
value in the vector is the index of the nonzero blending functions for the
parameter nv. Note that vectors can be used to reduce communication needs
between processors in the grid at the expense of increased storage.

Note that the length of the vectors is m (given an n x m control point
grid) for the $P_{j,k}$'s and 2 for the $u_{j,k}$'s and the $N_{j,k}$'s.
Thus you have to place vectors of this length in each processor in the
grid for the control points. For steps 4 and 5, two sets of arrays are used
to store the $u_{j,k}$'s and the $N_{j,k}$ vectors. The CM allows for
variables that consist of vectors to be distributed throughout the
processor grid. Note that the arrays for the curve algorithm are 1
dimensional arrays set up using the CM LAYOUT compiler directive to declare
the arrays parallel. For the surface algorithm the LAYOUT directive is used
to develop 2 dimensional arrays. The first dimension (or AXIS) is declared
SERIAL while the 2nd dimension is declared parallel (declared SEND or
NEWS). For the example of the processor grid in this paper, the n x m
control point grid is really n x n. Note that any array with a serial
dimension is a vector allocated to each processor in the processor set.

Basically, steps 0, 1, 2 and 3 of the surface algorithm are similar to
steps 0, 1, 2 and 3 of the curve algorithm in section 3.4. The main
difference is in steps 4 and 5. Steps 0, 1, 2 and 3 just require the
correct 2 dimensional LAYOUT directive for the $P_{j,k}$'s. Note that the
u parameter value is the first index of the serial AXIS for each of the
$u_{j,k}$'s and the $N_{j,k}$'s. Steps 0 thru 3 can have the assignment
statements replaced with a vector assignment for each processor in the
grid. Also note that the variables t1 thru t6 are used for both the u
parameters and the v parameters. Note that since the serial dimension is 2
one can use CM arrays designated as v and M and duplicate the blocks
containing the formulas in step 3, replacing the $u_{j,k}$'s with
$v_{j,k}$'s and the $N_{j,k}$'s with $M_{j,k}$'s. For a control point grid
of unequal dimensions, the processor grid dimensions would be based upon
the larger of the control point grid dimensions. The CM ALIGN directive
would be needed to line up the $N_{j,k}$'s with the $M_{j,k}$'s. Different
values of the parameter increment variable S could be used. For unequal
grid sizes or variable increment steps the surface algorithm would have to
be modified slightly.

For steps 4 and 5, the final summations to produce the result in (VI)
require 1 sequential loop. Basically, each of the $N_{j,k}$'s in the
processor set of three processors is combined with 3 of the $M_{j,k}$'s.
Inside the loop each processor calculates a B (u,v) value for a fixed
value of v. The parameter v is the sequential loop parameter. Here is what
steps 4 and 5 look like. The variables c and d are the indexes of the
processor set containing the v parameter. The v parameter progresses from
0.0 to n-o+1 in increments of 1/s. This progression is in the downward
direction in Figure 3. This produces an O(n) algorithm for the surface
spline.

E = 1

FOR R = 1 to (n*s) DO

    FORALL P_{j,k} , 1 <= j <= s , 3 <= k <= n DO

        c = R mod s
        IF (c = 0) c = s
        d = R - (E)*S-1 + 2

        T2_{j,k} <= M2_{c,d-2}(v2_{c,d-2})
        T1_{j,k} <= M1_{c,d-1}(v1_{c,d-1})
        T_{j,k}  <= M_{c,d}(v_{c,d})

        G_{j,k} =  p_{d,j,k}
        H_{j,k} <= p_{d-1,j,k-1}
        I_{j,k} <= p_{d-2,j,k-2}

        a1 = I_{j,k} * N2_{j,k-2}(u2_{j,k-2}) * T2_{j,k}
        a2 = I_{j,k} * N2_{j,k-2}(u2_{j,k-2}) * T1_{j,k}
        a3 = I_{j,k} * N2_{j,k-2}(u2_{j,k-2}) * T_{j,k}
        b1 = H_{j,k} * N2_{j,k-1}(u2_{j,k-1}) * T2_{j,k}
        b2 = H_{j,k} * N2_{j,k-1}(u2_{j,k-1}) * T1_{j,k}
        b3 = H_{j,k} * N2_{j,k-1}(u2_{j,k-1}) * T_{j,k}
        c1 = G_{j,k} * N2_{j,k}(u2_{j,k}) * T2_{j,k}
        c2 = G_{j,k} * N2_{j,k}(u2_{j,k}) * T1_{j,k}
        c3 = G_{j,k} * N2_{j,k}(u2_{j,k}) * T_{j,k}

        B_{j,k} = a1 + a2 + a3 + b1 + b2 + b3 + c1 + c2 + c3

        WRITE (B_{j,k})

    END FORALL
    IF (R mod s = 0) E = E + 1
END FOR

Notes:

1. The d subscript in the control point vector variable p_{d,j,k} is the
serial dimension. The vectors are repeated in each row of the processor
grid (p_{d,j,k} = p_{d,j+1,k} = p_{d,j+2,k} = ... = p_{d,s,k}). The d
subscript is called f in the above discussion. Since the M_{j,k}'s are laid
out similar to the N_{j,k}'s, the index is in the same relative position in
the grid as the index for the N_{j,k}'s.
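The nine-term sum inside the loop is the tensor product of three nonzero u weights and three nonzero v weights. The Python sketch below shows that product for a single (u,v) pair; it is not the CM layout. The names `weights3` and `surface_point` are hypothetical, the order is fixed at 3, and the knots and control net are indexed from 0 as in section 3.1.

```python
def weights3(u, i, t):
    """Order-3 weights (N_i, N_{i-1}, N_{i-2}) for t[i] <= u < t[i+1]."""
    z1 = (u - t[i]) / (t[i + 2] - t[i]) * (u - t[i]) / (t[i + 1] - t[i])
    z2 = (t[i + 1] - u) / (t[i + 1] - t[i]) * (u - t[i - 1]) / (t[i + 1] - t[i - 1])
    z3 = (u - t[i]) / (t[i + 1] - t[i]) * (t[i + 2] - u) / (t[i + 2] - t[i])
    z4 = (t[i + 1] - u) / (t[i + 1] - t[i - 1]) * (t[i + 1] - u) / (t[i + 1] - t[i])
    return z1, z2 + z3, z4


def surface_point(u, v, iu, iv, net, tu, tv):
    """B(u,v) as the sum of nine products over the 3 x 3 patch of control points."""
    total = [0.0, 0.0, 0.0]
    for a, wu in enumerate(weights3(u, iu, tu)):
        for b, wv in enumerate(weights3(v, iv, tv)):
            p = net[iu - a][iv - b]          # control point paired with this weight pair
            for c in range(3):
                total[c] += p[c] * wu * wv
    return tuple(total)


knots = [0, 0, 0, 1, 2, 3, 3, 3]              # n = 4, o = 3 in both directions
net = [[(float(i), float(j), 7.0) for j in range(5)] for i in range(5)]
print(surface_point(1.5, 0.5, 3, 2, net, knots, knots))
```

Because each set of three weights sums to 1, a flat control net at height 7 yields a surface point at height 7, which is a convenient sanity check.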

4.0  POLYGON  STORAGE  AND  TRIANGULARIZATION 

In this section we discuss polygon storage schemes that lead to a
natural and automatic triangularization of CSG primitives and also of
splines. By using the CM to store 1 point of a polygon per processor,
triangularization patterns can be generated for most of the polygonal CSG
primitives and splines. In addition, by using spline surfaces to cover
nonpolygonal CSG solids, triangularization patterns can be generated for
cone, cylinder and ellipse CSG primitives. For the nonpolygonal CSG
primitives you just cut the spline 'skin' and unwrap it, just like you
unwrap the earth when rendering it on the pages of an atlas. For polygonal
CSG primitives consisting of 4 to 8 points, storage and triangularization
requires 10 processors or less in a 2 by 5 rectangular grid. Since the
triangularization patterns and information relating to adjacency of
triangles in the pattern can be stored in square grid subsets of the
rectangular grid, the CM NEWS interconnection structure can be used to
reduce processor interconnection times. In the case of spline surfaces or
the spline 'skins' of the nonpolygonal CSG primitives, storage of the
points and triangularization patterns requires a Cu by Cv rectangular grid.
Cu is the number of points calculated using parameter u and equals
(nu-o+1)*s+1. Cv is the number of points calculated using parameter v and
equals (nv-o+1)*s+2 (1 extra to account for the vertex duplication in the
pattern to utilize the NEWS network). The construction of complex
nonpolygonal or spline solids by concatenation can be accomplished by
concatenating the spline 'skins' of the spline or nonpolygonal primitives,
and the storage grids can be placed adjacent to one another. Note that the
adjacency structure of the CM grids can define a super grid for a complex
solid. CM instructions that return masks or 'in use' patterns can be useful
in bounding-box type calculations for solids or groups of solids. Note
that the vertex duplication is a case of trading processors for time.
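As a worked example of the grid size formulas, the Python sketch below computes Cu and Cv and compares the processor count with the 8192 physical processors of the target machine. The function name `storage_grid` is hypothetical and the sample values are chosen only for illustration.

```python
def storage_grid(nu, nv, o, s):
    """Return (Cu, Cv), the storage grid dimensions for a spline surface."""
    cu = (nu - o + 1) * s + 1
    cv = (nv - o + 1) * s + 2    # one extra column for the duplicated vertices
    return cu, cv


cu, cv = storage_grid(7, 7, 3, 4)
print(cu, cv, cu * cv, cu * cv <= 8192)   # 21 22 462 True
```

Doubling s roughly quadruples the processor count, so the step count is the main lever on how many surfaces fit on the machine at once.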

4.1  POLYGON  STORAGE  AND  TRIANGULARIZATION  ALGORITHMS 

The  CM  can  be  used  to  automatically  decompose  CSG  primatives  and 
spline  surfaces  into  triangle  or  rectangle  components  without  any  further 
additional  operations  other  that  those  operations  required  to  input  the 
solids  into  the  CM  processors  from  the  front  end  or  those  operations 
required  to  transfer  or  expand  a  solid  from  one  rectangular  grid  to  another 
rectangular  grid.  This  automatic  triangularization  is  based  upon  regular 
storage  schemes  for  the  various  CSG  and  spline  primatives  which  produce 
identifiable  processor  patterns  within  the  rectangular  grids  which  make 
up  the  primatives  discussed  in  this  paper.  The  primary  type  of  complex 
solid  is  composed  of  spline  surfaces  or  combinations  of  spline  surfaces 
where  the  control  points  are  coincident  at  the  points  of  combination.  Each 
processor  in  this  discussion  has  for  each  polygonal  solid  allocated  to  the 
processor  the  following  variables.  Pj j  which  is  the  point  coordinate  for  a 

vertex  of  primative  k.  Sjj  contains  k.  Ej j  is  an  end  of  polygon  flag  used 
for  operations  on  complex  solids.  Xjj  and  Y  j  j  are  offsets  from  Pi  -j  and 

identify  the  i  and  j  index  of  the  processor  at  the  upper  left  corner  of  the 
grid  corresponding  to  primative  or  solid  k.  Pj  j  contains  vertex  1  of  the 

primative.  Xj  j  and  Yjj  are  just  to  flag  the  location  of  a  primative  inside 
of  a  large  processor  set  containing  many  primatives. 

Consider the following algorithm for the calculation of the plane
equations of all the rectangular (each equal to two triangles) surface
components of an n by m spline surface. Figure 10 illustrates the storage
and triangularization pattern for an open spline surface of 4 by 4 surface
points. All the variables are vectors. The id of the spline is k. The
numbers in Figure 10 indicate the vertices of the spline rectangle
components. The points of the spline triangle components are stored in
P_{i,j}, P_{i,j+1}, P_{i,j+2}, P_{i,j+3}, P_{i,j+4} and P_{i+1,j},
P_{i+1,j+1}, P_{i+1,j+2}, P_{i+1,j+3}, P_{i+1,j+4}. Note that the
triangles can be accessed from the processors in the grid containing the
vertices connected by dashed and solid lines along the top and bottom of
the grid in figure 4. In this application, steps 2, 3 and 4 of the
algorithm place a triangle in each of the processors P_{i,j+1},
P_{i,j+2}, P_{i,j+3}, P_{i,j+4} and P_{i+1,j}, P_{i+1,j+1}, P_{i+1,j+2},
P_{i+1,j+3}; the algorithm then calculates the equation of the plane
containing each triangle. Note that T1_{i,j} is a vector with the x, y, z
coordinates of the point P_{i,j}, T2_{i,j} is a vector with the x, y, z
coordinates of the point P_{i+1,j}, and T3_{i,j} is a vector with the
x, y, z coordinates of the point P_{i,j-1}. The algorithm is an example of
the use of the NEWS network to take advantage of the automatic
triangularization of a spline CSG primitive. The plane equation is
Ax + By + Cz + D = 0. Ry_{i,j} is the y coordinate of the vector R_{i,j}.
The situation is similar for Q_{i,j}.

In general, the triangle points for the various figures will reside in
the processors that are at the right angle vertex of the patterns in the
grids that are associated with a particular CSG primitive or complex
solid. Steps 2, 3 and 4 of the algorithm show the use of the NEWS network
to store the triangle vectors in the processors connected with the solid
lines in the grid in figure 10. Note that the processors along the sides
of the grid have a bogus triangle due to the nature of the FOR ALL
statement used in the algorithm. Steps 5 to 10 calculate the coefficients
of the plane equation for all of the triangles of the CSG 4-point
primitive simultaneously. Steps 7 to 9 are calculations of the minors of
the coordinate determinants. Steps 5 and 6 are straight vector operations
that can take advantage of the CM vector operation instructions. Remember
that the vectors are just arrays with a SERIAL dimension. The pseudocode
below calculates plane equations for the solid triangles only. The
calculation of the plane equations of the lower dashed triangles is
similar, and only one triangle is needed for the rectangle plane equation
for each of the rectangle components of the surface.


1)  FOR ALL R_{i,j} where S_{i,j} = k DO
2)      T1_{i,j} = P_{i,j}
3)      T2_{i,j} = P_{i+1,j}
4)      T3_{i,j} = P_{i,j-1}
5)      R_{i,j} = T2_{i,j} - T1_{i,j}
6)      Q_{i,j} = T3_{i,j} - T1_{i,j}
7)      A_{i,j} = Ry_{i,j} * Qz_{i,j} - Rz_{i,j} * Qy_{i,j}
8)      B_{i,j} = -( Rx_{i,j} * Qz_{i,j} - Rz_{i,j} * Qx_{i,j} )
9)      C_{i,j} = Rx_{i,j} * Qy_{i,j} - Ry_{i,j} * Qx_{i,j}
10)     D_{i,j} = -( ( A_{i,j} * T1x_{i,j} ) + ( B_{i,j} * T1y_{i,j} ) + ( C_{i,j} * T1z_{i,j} ) )
11) END FOR ALL


FOUR AND SIX POINT POLYGON STORAGE
AND TRIANGULARIZATION PATTERNS
FIGURE 4
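

The plane equation algorithm above can be sketched sequentially in Python
as follows. This is a minimal sketch under stated assumptions: the
function name plane_equations is hypothetical, the nested loops stand in
for the parallel FOR ALL, and grid positions that lack a neighbor at
(i+1, j) or (i, j-1) are skipped rather than given the bogus triangle that
the FOR ALL statement produces on the CM.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]
Plane = Tuple[float, float, float, float]


def plane_equations(P: List[List[Point]]) -> Dict[Tuple[int, int], Plane]:
    """Return (A, B, C, D) for the triangle anchored at each grid position.

    P[i][j] is the vertex stored in processor (i, j). The triangle at
    (i, j) uses T1 = P[i][j], T2 = P[i+1][j] and T3 = P[i][j-1].
    """
    planes = {}
    rows, cols = len(P), len(P[0])
    for i in range(rows - 1):          # needs a neighbor at i+1
        for j in range(1, cols):       # needs a neighbor at j-1
            t1, t2, t3 = P[i][j], P[i + 1][j], P[i][j - 1]
            # Edge vectors R = T2 - T1 and Q = T3 - T1.
            rx, ry, rz = t2[0] - t1[0], t2[1] - t1[1], t2[2] - t1[2]
            qx, qy, qz = t3[0] - t1[0], t3[1] - t1[1], t3[2] - t1[2]
            # Minors of the coordinate determinant give the plane normal.
            a = ry * qz - rz * qy
            b = -(rx * qz - rz * qx)
            c = rx * qy - ry * qx
            # D follows from requiring T1 to lie on the plane.
            d = -(a * t1[0] + b * t1[1] + c * t1[2])
            planes[(i, j)] = (a, b, c, d)
    return planes
```

Each processor does the same arithmetic on its own T1, T2 and T3, which
is why the loop body contains no dependence between grid positions.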


Consider the case of a five or eight point CSG primitive of the form
depicted in figure 5. These primitives are stored in a 2 by 5 processor
grid. The points of the primitive are stored in P_{i,j}, P_{i,j+1},
P_{i,j+2}, P_{i,j+3}, P_{i,j+4} and P_{i+1,j}, P_{i+1,j+1}, P_{i+1,j+2},
P_{i+1,j+3}, P_{i+1,j+4}. Note that the triangles can be accessed from
the processors in the grid containing the vertices connected by dashed
and solid lines along the top and bottom of the grid in figure 5. The
above algorithm for the plane equations is applicable to this case.
Figure 5 illustrates the storage pattern and triangularization pattern
for the 5 and 8-point CSG primitives. Observe the grid pattern in
figure 5. Figure 4 is an illustration of the triangularization patterns
for four and six point CSG primitives.


FIVE AND EIGHT POINT POLYGON STORAGE
AND TRIANGULARIZATION PATTERNS
FIGURE 5


If you unwrap the pyramid by cutting along the edge containing vertices 1
and 5, you get a pattern related to the solid line triangle pattern in
figure 5. If you unwrap the rectangular solid, you get a pattern related
to both the solid and dashed line patterns in figure 5. Thus you can
think of the CSG primitives as consisting of a spline surface. Remember
that sharp angles between faces (less than 100 degrees) can be simulated
by multiple control points in the patterns in figures 4 and 5. Figure 10
is an example of a closed spline surface consisting of a 4 by 4 point
surface. Thus the patterns of figures 4 and 5 can be generalized to
splines of arbitrary size and shape. Figure 6 is an example of a
nonpolygonal CSG primitive whose storage and triangularization pattern
has been transformed to the spline storage and triangularization pattern.
Thus the algorithm for plane equation calculation can be generalized to
all of the CSG primitives and complex spline solids. Figure 7 is an
illustration of a part of an open spline surface. When visualized as a
spline surface, figure 10 is an illustration of the storage and
triangularization grid for the arbitrary spline surface.


CONE  WITH  SIX  BY  FOUR  CONTROL 
POINT  MESH  OVERLAIN 
FIGURE  6 


FIVE  BY  FOUR  CONTROL  POINT  MESH 
FOR  ARBITRARY  B-SPLINE 
FIGURE  7 


Consider the combination of complex solids from simpler CSG
primitives. If the components are stored according to the parallel scheme
outlined above, then storage of the combination is impractical within the
CM processor grid. The only practical complex primitive that can be
stored by the above parallel scheme is the spline surface. An open spline
storage and triangularization scheme is straightforward and is just an
extension of figure 10 to larger grid dimensions. For an open spline
surface consisting of n by m points in a grid, the processor grid
required within the CM is of dimension n+1 by m. The above algorithm for
the plane equation calculation is immediately applicable to an open
spline surface. Thus any open spline can be triangularized with a
resulting regular pattern to the triangularization. This pattern can be
exploited by any algorithm that does calculations based upon the triangle
components of a surface. A closed spline surface can utilize this storage
and triangularization scheme by dissecting it into a series of open
spline surfaces. Each open spline component will retain the solid
identification value k. For applications which require calculations from
triangle members of two or more components, a parallel compare of the
T1_{i,j}, T2_{i,j} and T3_{i,j} values can identify the processors
containing the required triangles.
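

One possible reading of the parallel compare is sketched below in
sequential Python. It assumes that the required triangles are those in
different components that share an edge, that is, at least two identical
vertices, and that coincident control points are stored with identical
coordinates. The function name matching_cells and the min_shared
parameter are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]
Cell = Tuple[int, int]
Triangle = Tuple[Point, Point, Point]


def matching_cells(tri_a: Dict[Cell, Triangle],
                   tri_b: Dict[Cell, Triangle],
                   min_shared: int = 2) -> List[Tuple[Cell, Cell]]:
    """Pair up processors whose (T1, T2, T3) triangles share vertices.

    tri_a and tri_b map a processor index (i, j) to its triangle for two
    open spline components. On the CM each processor would perform its
    compare at the same time; here the compare is a double loop.
    """
    matches = []
    for cell_a, verts_a in tri_a.items():
        for cell_b, verts_b in tri_b.items():
            shared = set(verts_a) & set(verts_b)
            if len(shared) >= min_shared:
                matches.append((cell_a, cell_b))
    return matches
```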


1 - 2 - 3 - 4 - 1


5 - 6 - 7 - 8 - 5


FIGURE 10


5.0  SYMBOLIC  INPUT 

Symbolic input is closely related to the inverse spline problem in
computer graphics. Given an algebraic equation as input, the problem is to
parse the equation, calculate the surface or curve points and then
calculate the control point net for the algebraic surface of interest. The
corresponding sequential algorithm for the inverse spline problem is from
a paper by Rodgers, Satterfield, and Rodriguez [1]. A brief description of
the algorithm is as follows. Let Q and B be column matrices (vectors).
Let C be an r by t matrix where t equals n*m and r is equal to the number
of surface points. The number of control points relating to the u
parameter is n and the number of control points relating to the v
parameter is m. Q contains the surface points and B contains the control
points. Thus Q = C * B is the equation of the spline surface. Therefore
solving for B we obtain:

(X)  B = ( C^T * C )^{-1} * C^T * Q.


Where

(XI)  C_{i,j} = \sum_{i=1}^{nv} N_{i,k}( u_{i,j} ) * M_{j,k}( v_{i,j} ).

The u_{i,j} and v_{i,j} parameters for the C matrix entries are
calculated as follows. The surface points form a (n-o+1) by (m-o+1) grid.
If stored in column major order, then each column of points was produced
by a fixed value of u and varying values of v. Similarly, each row of
points was produced by a fixed value of v and varying values of u. Thus,
u_{i,j} and v_{i,j} are calculated by taking the ratios of i to n and j
to m and calculating the equivalent ratios of u_{i,j} and v_{i,j} to the
maximum values of u and v. Then the values of u_{i,j} and v_{i,j} can be
distributed in the pattern required for a similar parallel spline
algorithm for surfaces, and the blending functions calculated. There
exist standard algorithms for matrix multiplication which are of
O(log n). Matrix transposition algorithms exist which are of O(n) for an
n by n matrix. An algorithm exists [2] for the solution of a block
diagonal system that is of O(n log m). The CM has instructions for matrix
multiplication and matrix transposition. Using the algorithm of [2] and
the CM matrix instructions, B can be calculated in O(n log m). There
exists an algorithm for the calculation of the control point net for an
m by n bicubic B-spline patch of O(log m + log n) [3]. The O(n log m)
estimate assumes that the CM instructions are of the same order as the
software algorithms reported in the literature.
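

The parameter ratios and the solution of equation (X) can be sketched
sequentially in Python as follows. This is an illustration under stated
assumptions, not the CM implementation or the algorithm of [2]: the
function names are hypothetical, the index i is assumed to run from 1 to
n, the matrix C is assumed to be already filled in with blending function
values, and plain Gauss-Jordan elimination replaces the block diagonal
solver.

```python
def uniform_parameters(count, max_value):
    """u_i = (i / count) * max_value for i = 1..count (assumed index range)."""
    return [(i / count) * max_value for i in range(1, count + 1)]


def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def solve(A, Y):
    """Solve A * X = Y by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    aug = [[float(v) for v in A[r]] + [float(v) for v in Y[r]]
           for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("C^T * C is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def control_points(C, Q):
    """B = (C^T * C)^{-1} * C^T * Q, computed without forming the inverse."""
    Ct = transpose(C)
    return solve(matmul(Ct, C), matmul(Ct, Q))
```

Here Q holds one surface point per row with x, y and z columns, so the
same elimination produces all three coordinates of each control point.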


For  the  problem  of  parsing  the  original  arithmetic  input  to  produce  the 
surface  values  input  to  the  inverse  spline  algorithm  there  are  plenty  of 
algorithms  for  parallel  parsing  based  upon  the  standard  sequential  infix  to 
postfix  algorithm  of  Horowitz  and  Sahni.  Most  of  these  algorithms  are  of 
time  complexity  O(logn),  with  n  equal  to  the  length  of  the  infix  expression. 
The  number  of  processors  required  is  of  the  order  O(n). 
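

For reference, the sequential infix to postfix conversion that the
parallel parsers start from can be sketched in Python as follows. This is
a common stack-based textbook formulation and is not claimed to match the
Horowitz and Sahni presentation line for line. The function name
to_postfix, the operator table and the token pattern are assumptions, and
unary minus is not handled.

```python
import re

# Assumed operator table: larger number binds tighter.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2, "^": 3}
RIGHT_ASSOCIATIVE = {"^"}


def to_postfix(expr):
    """Convert an infix arithmetic expression to a list of postfix tokens."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+\.?\d*|[-+*/^()]", expr)
    output, stack = [], []
    for tok in tokens:
        if tok in PRECEDENCE:
            # Pop operators that must be applied before this one.
            while stack and stack[-1] in PRECEDENCE and (
                PRECEDENCE[stack[-1]] > PRECEDENCE[tok]
                or (PRECEDENCE[stack[-1]] == PRECEDENCE[tok]
                    and tok not in RIGHT_ASSOCIATIVE)
            ):
                output.append(stack.pop())
            stack.append(tok)
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise ValueError("unbalanced parentheses")
            stack.pop()  # discard the matching "("
        else:
            output.append(tok)  # operand: variable name or number
    while stack:
        op = stack.pop()
        if op == "(":
            raise ValueError("unbalanced parentheses")
        output.append(op)
    return output
```

The single left-to-right pass takes time proportional to the length n of
the expression, which is the baseline that the O(log n) parallel
algorithms improve upon.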